Versions:

  • 0.56.0
  • 0.55.0
  • 0.54.1
  • 0.54.0
  • 0.53.0

xan 0.56.0, published by médialab Sciences Po, is a high-performance command-line CSV processor written in Rust for researchers and data engineers who need to manipulate multi-gigabyte delimited files without leaving the shell. Originally forked from BurntSushi’s xsv and since almost completely rewritten, the tool, offered here in five releases from 0.53.0 to 0.56.0, combines a SIMD-accelerated parser, multithreaded computation, and a domain-specific expression language optimized for tabular data.

Typical workflows, such as filtering millions of scraped web records, joining lexical tables, aggregating temporal survey waves, or sorting bioinformatics .vcf, .gtf, .sam and .bed files, are handled through composable subcommands that can be piped together like standard Unix utilities. Beyond classical slicing, sorting, joining and aggregating, xan supplies modules for lexicometric text counts, graph-theoretic network extraction, and even lightweight scraping, making it equally relevant for digital humanities, web-archival .cdx analysis, and high-throughput genomic pipelines. The embedded expression evaluator runs markedly faster than embedded Python or JavaScript while keeping memory usage minimal, so a single workstation can process datasets that might otherwise call for distributed tools. Because all functions are designed to be chainable, complex reproducible transformations can be stored as one-liner scripts that run identically in macOS, Linux, or Windows terminals.

xan is available free of charge on get.nero.com. Downloads are served from trusted Windows package sources (e.g. winget), always deliver the latest version, and support batch installation of multiple applications.
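
The pipe-style composition described above can be illustrated with a small Python sketch. This is only an analogy for chaining a filter stage into an aggregation stage over streamed rows; it is not xan's implementation, its expression language, or its command syntax. The function names (read_rows, keep, count_by, pipeline) and the sample data are hypothetical and chosen purely for illustration.

```python
import csv
import io
from collections import Counter


def read_rows(text):
    """Parse CSV text lazily into dict rows, one at a time."""
    yield from csv.DictReader(io.StringIO(text))


def keep(rows, predicate):
    """Filter stage: pass through only the rows matching the predicate."""
    return (row for row in rows if predicate(row))


def count_by(rows, column):
    """Aggregate stage: count rows per distinct value of a column."""
    return Counter(row[column] for row in rows)


# Hypothetical sample data standing in for scraped web records.
SAMPLE = """url,lang,year
https://example.org/a,fr,2021
https://example.org/b,en,2021
https://example.org/c,fr,2022
https://example.org/d,fr,2022
"""


def pipeline(text):
    """Chain the stages: read -> filter on lang -> count per year -> sort."""
    rows = read_rows(text)
    french = keep(rows, lambda r: r["lang"] == "fr")
    counts = count_by(french, "year")
    return sorted(counts.items())


if __name__ == "__main__":
    for year, n in pipeline(SAMPLE):
        print(year, n)
```

Each stage consumes an iterator and hands one on to the next, so rows flow through the chain without the filtered set being held in memory, which is the same idea as piping one subcommand into another in the shell.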

Tags: